Parallel LDA Through Synchronized Communication Optimizations
Authors
Abstract
Sophisticated big data machine learning applications are difficult to parallelize: they must not only process a large training dataset but also synchronize large model data across iterations. In parallel LDA, comparing synchronized and asynchronous communication methods under data parallelism and model parallelism, we observe that the power-law distribution of word counts in LDA training datasets suggests that synchronized communication optimizations can improve the efficiency of model updates, allowing the model to converge faster, shrinking the model size, and further reducing computation time in later iterations. We therefore abstracted new synchronized communication operations and developed two new parallel LDA implementations, “lda-lgs” and “lda-rtt”. We compare our new approaches to leading implementations in the field on an Intel Haswell cluster with 100 nodes and 4,000 threads. In data parallelism, “lda-lgs” reaches higher model likelihood in shorter or similar execution time compared with Yahoo! LDA. In model parallelism, when achieving similar model likelihood, “lda-rtt” runs up to 3.9 times faster than Petuum LDA.
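The contrast the abstract draws between synchronized and asynchronous model updates can be illustrated with a minimal sketch. The snippet below is not the paper's lda-lgs or lda-rtt implementation; it only sketches a synchronized update of the word-topic count matrix using an MPI allreduce, with hypothetical names (n_words, n_topics, sample_local_partition) standing in for real training code.

```python
# A minimal sketch of a synchronized model update for parallel LDA.
# All names are hypothetical; this is not the paper's lda-lgs/lda-rtt code.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

n_words, n_topics = 10000, 100  # hypothetical model dimensions
word_topic = np.zeros((n_words, n_topics), dtype=np.int64)

def sample_local_partition():
    """Stand-in for a Gibbs sampling sweep over this worker's documents.
    Returns the change in word-topic counts produced by the sweep."""
    return np.zeros((n_words, n_topics), dtype=np.int64)

for iteration in range(10):
    local_delta = sample_local_partition()
    # Synchronized communication: every worker contributes its update and
    # receives the combined update before the next iteration begins, so all
    # workers sample against the same, up-to-date model.
    global_delta = np.empty_like(local_delta)
    comm.Allreduce(local_delta, global_delta, op=MPI.SUM)
    word_topic += global_delta
```

Because word counts in LDA corpora follow a power law, most rows of local_delta stay sparse; a dense allreduce like the one above ignores that, which is the kind of inefficiency the paper's synchronized communication optimizations are designed to exploit.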
Similar resources
High Performance LDA through Collective Model Communication Optimization
LDA is a widely used machine learning technique for big data analysis. The application includes an inference algorithm that iteratively updates a model until it converges. A major challenge in parallelization is scaling, because the model is huge and parallel workers need to communicate it continually. We identify three important features of the model in para...
HarpLDA+: Optimizing Latent Dirichlet Allocation for Parallel Efficiency
Latent Dirichlet Allocation (LDA) is a widely used machine learning technique in topic modeling and data analysis. Training large LDA models on big datasets involves dynamic and irregular computation patterns and is a major challenge to both algorithm optimization and system design. In this paper, we present a comprehensive benchmarking of our novel synchronized LDA training system HarpLDA+ bas...
Optimizations for Parallel Computing Using Data Access Information
Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide or parallelize communication may improve the performance of parallel computations. This paper describes our experience automatically applying communication optimizations in the context of Jade, a portable, implicitly parallel programming language designed for exploiting task-le...
Optimizations for Message Driven Applications on Multicore Architectures
With the growing amount of parallelism available on today’s multicore processors, achieving good performance at scale is challenging. We approach this issue through an alternative to traditional thread-based paradigms for writing shared memory programs, namely message driven multicore programming. We study a number of optimizations that improve the efficiency of message driven programs on multi...
Evaluating Compiler Optimizations for Fortran D
The Fortran D compiler uses data decomposition specifications to automatically translate Fortran programs for execution on MIMD distributed-memory machines. This paper introduces and classifies a number of advanced optimizations needed to achieve acceptable performance; they are analyzed and empirically evaluated for stencil computations. Communication optimizations reduce communication overhead...